[python] Create xarray backend for DataArray types #3243

Merged: 14 commits, Nov 4, 2024

Conversation

jp-dark
Collaborator

@jp-dark jp-dark commented Oct 25, 2024

Issue: Closes #3242

Changes:

Adds an Xarray BackendArray wrapper and DataStore for the SOMADenseNDArray. This allows the user to open a single SOMADenseNDArray as an Xarray Dataset with a single DataArray.

This is important because we already use (and need) the SpatialData package, and it uses Xarray objects.

codecov bot commented Oct 25, 2024

Codecov Report

Attention: Patch coverage is 94.33962% with 3 lines in your changes missing coverage. Please review.

Project coverage is 84.92%. Comparing base (945804c) to head (e28bd92).
Report is 12 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3243      +/-   ##
==========================================
+ Coverage   83.81%   84.92%   +1.11%     
==========================================
  Files          51       52       +1     
  Lines        5578     5613      +35     
==========================================
+ Hits         4675     4767      +92     
+ Misses        903      846      -57     
Flag Coverage Δ
python 84.92% <94.33%> (+1.11%) ⬆️

Components Coverage Δ
python_api 84.92% <94.33%> (+1.11%) ⬆️
libtiledbsoma ∅ <ø> (∅)

@johnkerl johnkerl changed the title Create xarray backend for DataArray types [python] Create xarray backend for DataArray types Oct 25, 2024
@jp-dark jp-dark requested a review from johnkerl October 28, 2024 14:08
@jp-dark jp-dark force-pushed the dark/xarray-backend/sc-57982 branch from 9bee8a8 to bf94672 Compare October 28, 2024 14:52
@johnkerl johnkerl changed the title [python] Create xarray backend for DataArray types [python] Create xarray backend for DataArray types Oct 28, 2024
@jp-dark jp-dark requested a review from nguyenv October 30, 2024 17:46
Collaborator

@ivirshup ivirshup left a comment

I would like to see some demonstration or ideally testing that makes sure that memory usage matches our expectations with lazy loading.

As ideas for testing, I can recommend memray's pytest plugin or even just logging whenever a tile is accessed and making sure you aren't accessing unneeded tiles when using the xarray interface.
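The tile-logging idea can be sketched without any real storage backend. The block below is only an illustration under stated assumptions: `LoggingTiledStore` and `LazyView` are hypothetical names, not part of tiledbsoma or xarray, the store synthesizes its values instead of reading from disk, and only unit-step slices are handled. It shows the shape of a test that checks no tile is read until data is requested, and that only the tiles overlapping the requested window are touched.

```python
from typing import Dict, List, Tuple


class LoggingTiledStore:
    """Hypothetical stand-in for a tiled dense 2-D array that logs tile reads."""

    def __init__(self, shape: Tuple[int, int], tile: Tuple[int, int]) -> None:
        self.shape = shape
        self.tile = tile
        self.access_log: List[Tuple[int, int]] = []

    def _read_tile(self, ti: int, tj: int) -> Dict[Tuple[int, int], int]:
        # Record the access, then synthesize the cells (value = flat index).
        self.access_log.append((ti, tj))
        th, tw = self.tile
        rows = range(ti * th, min((ti + 1) * th, self.shape[0]))
        cols = range(tj * tw, min((tj + 1) * tw, self.shape[1]))
        return {(r, c): r * self.shape[1] + c for r in rows for c in cols}

    def read(self, rows: slice, cols: slice) -> List[List[int]]:
        # Assumption: unit-step slices only; the step component is ignored.
        r0, r1, _ = rows.indices(self.shape[0])
        c0, c1, _ = cols.indices(self.shape[1])
        if r1 <= r0 or c1 <= c0:
            return []
        th, tw = self.tile
        cells: Dict[Tuple[int, int], int] = {}
        # Visit only the tiles that overlap the requested window.
        for ti in range(r0 // th, (r1 - 1) // th + 1):
            for tj in range(c0 // tw, (c1 - 1) // tw + 1):
                cells.update(self._read_tile(ti, tj))
        return [[cells[(r, c)] for c in range(c0, c1)] for r in range(r0, r1)]


class LazyView:
    """Hypothetical lazy wrapper: slicing records a window, load() reads it."""

    def __init__(self, store: LoggingTiledStore,
                 rows: slice = slice(None), cols: slice = slice(None)) -> None:
        self._store = store
        self._rows = rows
        self._cols = cols

    def __getitem__(self, key: Tuple[slice, slice]) -> "LazyView":
        # Simplification: slices are not composed with an earlier window.
        rows, cols = key
        return LazyView(self._store, rows, cols)

    def load(self) -> List[List[int]]:
        return self._store.read(self._rows, self._cols)


store = LoggingTiledStore(shape=(8, 8), tile=(4, 4))
window = LazyView(store)[0:2, 5:7]
assert store.access_log == []        # slicing alone reads nothing
values = window.load()
assert store.access_log == [(0, 1)]  # only the overlapping tile was read
```

In a real test the access log would have to come from the storage layer itself, for example through logging around tile reads, and the window would be selected through the xarray interface.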

Review comment on apis/python/tests/test_basic_xarray_io.py (outdated, resolved)
Review comment on apis/python/tests/test_basic_xarray_io.py (resolved)
@jp-dark jp-dark force-pushed the dark/xarray-backend/sc-57982 branch from e9bcc79 to ef6ac6f Compare October 30, 2024 20:19
@ivirshup
Collaborator

^ I would be happy to have performance monitoring pushed to a later PR if it gets us to spatialdata export sooner

@jp-dark
Collaborator Author

jp-dark commented Oct 31, 2024

> ^ I would be happy to have performance monitoring pushed to a later PR if it gets us to spatialdata export sooner

I added issue #3267 to track this.

@jp-dark jp-dark force-pushed the dark/xarray-backend/sc-57982 branch from 4df6528 to e7ddebc Compare November 1, 2024 17:02
Member

@johnkerl johnkerl left a comment

LGTM with suggested changes. (I could 'request changes' but I trust you to make them -- if you accept my suggestions as-is. If not, happy to iterate in further conversation.)

My main point of confusion when reading the unit-test cases was keeping track of the datatype of each variable. My comments are all about naming that makes the types stand out more clearly.

Review comment on apis/python/tests/test_basic_xarray_io.py (outdated, resolved)
Review comment on apis/python/tests/test_basic_xarray_io.py (outdated, resolved)
Review comment on apis/python/src/tiledbsoma/experimental/_xarray_backend.py (outdated, resolved)
@jp-dark jp-dark merged commit 001e60d into main Nov 4, 2024
12 checks passed
@jp-dark jp-dark deleted the dark/xarray-backend/sc-57982 branch November 4, 2024 17:37
Development

Successfully merging this pull request may close these issues.

[python] Create xarray backend for SOMADenseNDArray
4 participants